Automating fault tolerance in high-performance computational biological jobs using multi-agent approaches



Similar resources

Automating Fault Tolerance in High-Performance Computational Biological Jobs Using Multi-Agent Approaches

BACKGROUND: Large-scale biological jobs on high-performance computing systems require manual intervention if one or more computing cores on which they execute fail. This places not only a cost on the maintenance of the job, but also a cost on the time taken for reinstating the job and the risk of losing data and execution accomplished by the job before it failed. Approaches which can proactively...


Fault Tolerance for High-Performance Applications Using Structured Parallelism Models

In recent years, parallel computing has increasingly exploited the high-level models of structured parallel programming, an example of which is algorithmic skeletons. This trend has been motivated by the properties of structured parallelism models, which can be used to derive several (static and dynamic) optimizations at various implementation levels. In this thesis we study the proper...


Automating Middleware Specializations for Fault Tolerance

General-purpose middleware solutions, by definition, cannot readily support domain-specific semantics without significant manual effort in specializing the middleware. This paper presents GRAFT (GeneRative Aspects for Fault Tolerance), which is a model-driven, generative, and aspects-based approach to specialize general-purpose middleware with failure handling and recovery semantics imposed by ...


Fault-Tolerance for High-Performance Multi-Module VLSI Systems Using Micro Rollback

In order to achieve fault tolerance, highly reliable systems often require hardware-supported concurrent error detection for all system components. Checkers are connected in the communication paths from each module to the rest of the system, reducing system performance by requiring either longer clock cycles or additional pipeline stages. The performance penalty of concurrent error detection ca...



Journal

Journal title: Computers in Biology and Medicine

Year: 2014

ISSN: 0010-4825

DOI: 10.1016/j.compbiomed.2014.02.005